Abstract


Coronaviral infection of New World camelids was first identified in 1998 in llamas and alpacas with severe diarrhea. In order to understand this 
infection, one of the coronavirus isolates was sequenced and analyzed. It has a genome of 31,076 nt including the poly A tail at the 3’ end. This 
virus designated as ACoV-00-1381 (ACoV) encodes all 10 open reading frames (ORFs) characteristic of Group 2 bovine coronavirus (BCoV). 
Phylogenetic analysis showed that the ACoV genome is clustered closely (> 99.5% identity) with two BCoV strains, ENT and LUN, and was also 
closely related to other BCoV strains (Mebus, Quebec, DB2), a human coronavirus (strain OC43) (>96%), and porcine hemagglutinating 
encephalomyelitis virus (> 93% identity). A total of 145 point mutations and one nucleotide deletion were found relative to the BCoV ENT. Most 
of the ORFs were highly conserved; however, the predicted spike protein (S) has 9 and 12 amino acid differences from BCoV LUN and ENT, 
respectively, and shows a higher relative number of changes than the other proteins. Phylogenetic analysis suggests that ACoV shares the same 
ancestor as BCoV ENT and LUN. 
© 2007 Elsevier Inc. All rights reserved. 
Introduction 


Coronaviruses are a genus in the family Coronaviridae, 
which are enveloped viruses with large, positive sense RNA 
genomes of 29 to 32 kb. They are important causes of human 
and animal diseases that include respiratory infections, gastro- 
enteritis, hepatic and neurological disorders, as well as immune- 
mediated diseases such as SARS and feline infectious peritonitis 
(reviewed in de Groot-Mijnes et al., 2005; Kahn, 2006; Spaan 
et al., 1988; Wege et al., 1982). The coronaviruses possess a 
characteristic genome composition. The 5’ two-thirds of the 
genome encodes two polyproteins (1a and 1ab) that contain 
proteins necessary for RNA replication. The 3’ one-third 
encodes two non-structural proteins (NS1 and NS2) and several 
structural proteins, including a nucleocapsid protein (N) and 
three or four envelope proteins: the membrane (M), spike (S), 
hemagglutinin-esterase (HE), and/or small membrane (E) 
proteins. 

Originally, coronaviruses were classified into three groups on 
the basis of their antigenic cross reactivity (Cavanagh et al., 
1993; Cavanagh and Davis, 1993; Tsunemitsu et al., 1995). 
When coronavirus genome sequence data became available, the 
original antigenic groups were converted into three genetic 
groups based on the similarity of their nucleotide sequences 
(Gonzalez et al., 2003; Lai, 2003; Vijgen et al., 2006). Group 2 
coronaviruses include murine hepatitis virus (MHV), bovine 
coronaviruses (BCoV), human coronavirus OC43 (HCoV- 
OC43), rat sialodacryoadenitis virus, porcine hemagglutinating 
encephalomyelitis virus (PHEV), canine respiratory coronavirus 
and equine coronavirus. Fifteen strains of BCoV have been 
sequenced and have been implicated in a variety of diseases 
including respiratory and enteric infections (Chouljenko et al., 
2001a; Dea et al., 1995; Han et al., 2006; King and Brian, 1982; 
Park et al., 2006; Woloszyn et al., 1990). For example, the ENT 
and LUN strains were isolated from animals with fatal shipping 
fever pneumonia (Chouljenko et al., 2001a). The former was 
associated with enteritis, whereas the latter was associated with 
respiratory infection (pneumonia). Mebus, Quebec and DB2 are 
virulent strains associated with neonatal calf diarrhea and winter 
dysentery in adult dairy cattle (Dea et al., 1995; Han et al., 2006; 
Park et al., 2006). 

Coronavirus associated with outbreaks of diarrhea in all age 
groups of New World camelids (llamas and alpacas) was first 
identified in 1998, in Oregon. The symptoms were similar to 
those of cattle infected with bovine coronavirus. Sick camelids showed 
varying degrees of severity of clinical disease, with some 
animals dying and others requiring intensive medical care. 
Coronavirus-like viruses were frequently isolated from the 
diseased animals (Cebra et al., 2003). Because of the severity of 
the infection and the ability of coronaviruses to cross the species 
barriers, including human-animal barriers, we undertook an 
investigation to further characterize this virus. In this report we 
describe the sequence of the ACoV and compare it to other 
members of the coronavirus genus. We found that it is closely 
related to Group 2 bovine coronaviruses. 


Results and discussion 
Cloning and sequence analysis 


A total of 21 overlapping cDNA fragments were generated by 
RT-PCR cDNA cloning to encompass the entire RNA genome. 
The oligonucleotide primer sets used for these amplifications are 
listed in Table 1. The 5’ and 3’ ends of the genome were 
determined by 5’ and 3’RACE, respectively. This resulted in a 
genome sequence of 31,076 nt including a poly A tail of 38 nt. 
The genome encodes 10 ORFs characteristic of Group 2 BCoV 
and has 5’ and 3’ untranslated sequences of 210 and 298 nt (which 
do not include the poly A tail), respectively (Fig. 1). The ORF 
coordinates are shown in Fig. 1. The predicted ORF 1a and ORF 
1ab contained 13,152 and 21,282 nt, respectively (Fig. 1). ORF 
1ab contains a 26 nt region that overlaps with ORF 1a and 
includes a predicted ‘slippery’ sequence UUUAAAC. Based on 
evidence from other coronaviruses (Chouljenko et al., 2001b), 
this sequence causes a -1 frameshift during the translation of 
ORF 1a. As a result, a portion of the translation products (called ORF 1ab) 
avoids translation termination and contains an additional 
2711 amino acids. Upstream of five of the genes, there is a 
repeated intergenic sequence, UCUAAAC, that is predicted to 
interact with the viral transcriptase along with cellular factors to 
‘splice’ the leader sequence onto the start of each ORF. ORF- 
NS1, HE, S, M and N were all preceded by this sequence (Table 
2), which would be predicted to give rise to a nested set of 
mRNAs characteristic of the order Nidovirales. 
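
As a rough illustration of the motif comparison described above, the following Python sketch scans each upstream sequence listed in Table 2 for the window that is closest to UCUAAAC. The sequences are copied as printed in Table 2; the function name `closest_window` is hypothetical, and the comparison is a plain mismatch count, not a model of how the viral transcriptase recognizes the sequence.

```python
# Upstream sequences as printed in Table 2 (RNA, 5' to 3', ending at the AUG).
INTERGENIC = {
    "NS1": "UCUAAACUUUAAAAAUG",
    "HE": "ACUAAACUCAGUGAAAAUG",
    "S": "AAUCUAAACAUG",
    "M": "AAUCCAAACAUUAUG",
    "N": "AUCUAAACUUUAAGGAUG",
}
MOTIF = "UCUAAAC"  # repeated intergenic sequence named in the text


def closest_window(seq, motif=MOTIF):
    """Return (offset, mismatches) for the window of seq most similar to motif.

    Ties are resolved in favor of the leftmost window.
    """
    best_offset, best_mismatches = -1, len(motif) + 1
    for offset in range(len(seq) - len(motif) + 1):
        window = seq[offset:offset + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset, best_mismatches


if __name__ == "__main__":
    for gene, seq in INTERGENIC.items():
        offset, mismatches = closest_window(seq)
        print(f"{gene}: offset {offset}, {mismatches} mismatch(es)")
```

With the sequences as printed, the NS1, S and N entries contain an exact copy of the motif, whereas the HE and M entries each differ from it at one position.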


Sequence comparisons with bovine coronaviruses 


The sequence of ACoV was closely related to bovine 
coronaviruses (Table 3). It has 99.54% and 99.55% identity to the 
BCoV ENT and LUN strains, respectively. It was slightly less 
related to the Mebus and Quebec strains at about 98.5% (Table 
3). ACoV is also closely related to a human coronavirus (strain 
OC43) (>96%) and porcine hemagglutinating encephalomyelitis 
virus (>93% identity) (Fig. 2 and Table 3). Phylogenetic 
analysis of the entire genome of ACoV and other selected 
coronaviruses was carried out and the results are summarized in 
Fig. 2 and Table 3. 


Table 1 
Oligonucleotide primer sets used for RT-PCR, 5’RACE and 3’RACE 

No.      Name         Primer sequence 
5’end    5RACE1       CGCAGTGGTGGAGCATACTA 
         5RACE2       GGCCACTGCCTAGGATACAA 
1.       LCUF         GATTGCGAGCGATTTGCGTG 
         LC2016R      GCAGATTTTATCTGCGTAGTCA 
2.       bOIZE        TGCGTGATCCACGTTATGTT 
         LC4092R      CATTAGCAGGATTTACAACGACT 
3.       LCS TT9F     GTGCCATTTACAGCCCACTT 
         LCSI138R     AGGCAAGCAATTCCTTCTGA 
4.       LC4133F      GGTGGTGTTGCAAAGGCTAT 
         LC6993R      AACCCACATCCTGAATGGAA 
5.       LC6921F      CGCACAGTGGATTAAGAGCA 
         LC8738R      ACAGCAACAACAATGGGACA 
6.       LC8221F      CAGCTGATTTAGGTGTTCTGA 
         LCIO07 TR    ATGTCTGGGACAGTAGACCT 
7.       LC10000F     GGTCTACTGTCCCAGACA 
         LCIZ7 sik    CAAGGAGGATCTAACTCCCA 
8.       LC12539F     GCAAATCGGCATAATGAGGT 
         LC13913R     ACCTCCACCAATTTGTCTGC 
9.       LC13085F     TCATATGGTGGTGCGTCTGT 
         LC16485R     TGCAGCAAACAATTTCAAGC 
10.      LC16484F     AGCGCTTGAAATTGTTTGCT 
         LC17313R     CAATTGAGCAGGATCACCAA 
11.      LC17071F     CGTATTGTTCCTGCCAGGT 
         LC19948R     CTCCAACTTGTCCACCACT 
12.      19519F       ACAGGACAGGCTGGTGAAAT 
         20125R       TTGCCAGAGCATCATTACCA 
13.      LC20050F     AGTGGTGGACAAGGTTGGAG 
         LC22190R     TCATCATTCTCGGGAAGGTC 
14.      21908F       TCACTGATGCTGCACTTTCC 
         23688R       ATAACAGCAAAAGCCGTTGG 
15.      LC22190R     TCATCATTCTCGGGAAGGTC 
         LOZ 15ZF     GTTAGTCCCGGTCTGTGCAT 
16.      20132F       TGTGGTGATTATGCAGCATGT 
         27183R       CTACCACCAGCAATGCACAG 
17.      LOZ T1526    GTTAGTCCCGGTCTGTGCAT 
         LC29816R     GTCGGTGCCATACTGGTCTT 
18.      LC29642F     GACAAGGTGTGCCTATTGCA 
         LC30662R     GCTGATGTCCTCTGCAGTCA 
19.      LC304953F    TGAATAAACCCCGCCAGA 
         LC30782F     GAATGGATGTCTTGCTGCTA 
3’end    3RACE        GAATCTTGACGAACCCCAGA 


Comparison of nine predicted BCoV and ACoV proteins 


Analysis of the predicted ORF 1a and ORF 1b shows 14 aa 
and 1 aa changes, respectively, compared to the corresponding 
ORFs in the most closely related BCoV ENT strain. The ACoV 
HE protein is phylogenetically similar to those of BCoV ENT 
and LUN, with one and three amino acid differences, 
respectively (Fig. 3B). In addition, the following differences 
were noted between ACoV and the most closely related BCoV 
strains for the structural proteins: membrane (M), one difference 
out of 229 aa; internal orf (I), two differences out of 207 aa; and 
N, one to five differences out of 448 aa. The nucleocapsid (N) 
protein of coronavirus has been used as an early diagnostic 
marker for coronaviruses such as the SARS virus (Che et al., 
2004). The highly conserved small membrane envelope (E) 
protein is unique to BCoV Group 2 and is identical between 
ACoV and the closely related BCoV strains. A similar pattern of 
relatedness was evident for the non-structural (NS) proteins. 
There were one to two aa differences in the predicted 32 kDa 
NS1 protein of ACoV compared to the most closely related 
BCoV ENT and LUN strains. This protein is not essential for 
virus replication in vitro, but has been implicated in virus 
pathogenicity (Vijgen et al., 2006). There were also one to two aa 
mutations between the predicted 12.7 kDa NS2 and those from 
the most closely related BCoV isolates. 


Fig. 1. Map of the alpaca coronavirus genome (GenBank accession no. DQ915164) and ORFs of the viral coding sequence. NS1, non-structural protein gene 1; HE, 
hemagglutinin-esterase gene; S, spike gene; NS2, non-structural protein gene 2; E, envelope protein gene; M, membrane protein gene; N, nucleocapsid gene; I, internal 
ORF. The numbers above and below each ORF correspond to the nt coordinates of each gene. 


Comparison of the BCoV and ACoV predicted spike protein 


Because the S protein has been implicated in tissue tropism 
(Gallagher and Buchmeier, 2001; Godet et al., 1994; Schultze 
et al., 1991; Schultze and Herrler, 1994), its affinity may be 
reflected in the type of diseases caused. The spike protein of 
ACoV and BCoV is 1363 aa long. The predicted proteolytic 
cleavage site (KRRSRR) was conserved between all strains. 
Cleavage at this site results in the production of two subunits (S1 
and S2) in the well-characterized murine coronaviruses (de Haan 
et al., 2006). S1 and S2 of ACoV were predicted to be 760 and 
603 amino acids, respectively. The neighbor-joining method of 
molecular evolutionary analysis revealed that the ACoV S 
protein appears to evolve at a somewhat accelerated rate 
compared to other proteins, e.g., the HE (Fig. 3). The predicted 
ACoV S protein had 9 and 12 amino acid differences from the 
predicted S proteins of BCoV strains LUN and ENT, 
respectively. Most of the mutations occurred in the S1 subunit. 


Table 2 
Intergenic sequences upstream of ACoV genes 

Gene name    Coding sequence    Intergenic sequence upstream of the AUG 
NS1          21,504-22,340      UCUAAACUUUAAAAAUG 
HE           22,352-23,626      ACUAAACUCAGUGAAAAUG 
S            23,641-27,732      AAUCUAAACAUG 
M            28,690-29,382      AAUCCAAACAUUAUG 
N            29,392-30,738      AUCUAAACUUUAAGGAUG 

Three ACoV S amino acid changes are particularly striking, 
including: serine at aa #174; proline at 565; and serine at 702. In 
all the other S proteins analyzed (Fig. 3A and Table 4), these 
amino acids replace a Pro, Leu and Leu, respectively. The loss of 
one proline (#174) and the gain of another (#565) could 
significantly alter the structure of the ACoV S protein relative to the homologs 
from the closely related viruses. It has been shown that S2 is not 
directly involved in receptor binding, indicating that changes in 
S1 could be involved in host specificity (de Haan et al., 2006). 
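
The distribution of the variable positions between the two subunits can be tallied with a few lines of Python. This is only a sketch: it assumes that S1 corresponds to residues 1 to 760 of the 1363 aa spike protein (i.e., that the predicted cleavage product lengths map directly onto residue numbers), and it takes the residue positions from the first column of Table 4. The names `subunit` and `tally` are hypothetical.

```python
S1_LENGTH = 760  # predicted S1 length reported in the text
S2_LENGTH = 603  # predicted S2 length reported in the text

# Variable spike residue positions listed in the first column of Table 4.
TABLE4_POSITIONS = [54, 174, 179, 370, 483, 531, 565,
                    571, 702, 965, 1052, 1082, 1180, 1242]


def subunit(position, s1_length=S1_LENGTH):
    """Assign a residue number to S1 or S2.

    Assumption: S1 is residues 1..s1_length and S2 is everything after it.
    """
    return "S1" if position <= s1_length else "S2"


def tally(positions):
    """Count how many of the given positions fall in each subunit."""
    counts = {"S1": 0, "S2": 0}
    for position in positions:
        counts[subunit(position)] += 1
    return counts


if __name__ == "__main__":
    print(S1_LENGTH + S2_LENGTH)    # 1363, the full spike length
    print(tally(TABLE4_POSITIONS))  # {'S1': 9, 'S2': 5}
```

Under this assumption, 9 of the 14 variable positions in Table 4 lie in S1 and 5 lie in S2.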


Conclusions 


The sequence analyses presented in this report demonstrate 
that ACoV-00-1381, which is associated with diarrhea in 
camelids (Cebra et al., 2003), is closely related to bovine 
coronaviruses. It appears to have been derived from the same 
ancestor as the LUN strain isolated from cattle with fatal 
shipping fever pneumonia and the ENT strain isolated from 
cattle with either pneumonia or enteritis (Chouljenko et al., 
2001a; Storz et al., 1996). New World camelids have been in 
contact with cattle for over 500 years in South America, 
compared to their relatively recent and small-scale introduction 
to North America. Although the identification of coronaviral 
infection and epidemic diarrhea in all age groups of llamas and 
alpacas occurred only recently, it is possible that the virus 
crossed between species during earlier interspecies contact or is 
a BCoV strain that is pathogenic for both bovids and camelids, 
although BCoV infection in camelids had not been described 
previous to the Cebra et al. (2003) report. 

The most significant difference that we observed between 
ACoV and BCoV strains was in the spike protein. The S protein 
forms distinctive surface projections on the virions and is 
responsible for the primary attachment of the virus to cell 
surface receptors (Schultze et al., 1991). It is a glycosylated and 
acetylated polypeptide with a molecular weight of 170 kDa to 
220 kDa and is the major hemagglutinin of bovine corona- 
viruses (Schultze and Herrler, 1994). It has been suggested that 
the high degree of variation in host range and tissue tropism of 
coronaviruses is largely attributable to variations in the S 
glycoprotein (Gallagher, 2001). There are a number of distinct 
differences in S proteins between the respiratory isolates and 
diarrhea isolates (of bovine, human and porcine coronaviruses). 
Although the importance of such variability in the virulence 
and tropism of BCoV is unknown, some amino acid changes 
could have significant effects on the conformation, charge, 
hydrophobicity and antigenic regions of the protein. These 
mutations could change either the protein folding or physicochemical 
characteristics and could be involved in altering the 
host specificity of the virus. 


Table 3 
Sequence comparisons of ACoV with selected Group 2 coronaviruses 

              ACoV-alpaca  BCoV-DB2  BCoV-ENT  BCoV-LUN  BCoV-MEB  BCoV-Q  HCoV-OC43  PHEV-VW572 
ACoV-alpaca   -            99.18     99.54     99.55     98.53     98.48   96.52      93.4 
BCoV-DB2                   -         99.33     99.33     99.05     98.97   96.81      93.67 
BCoV-ENT                             -         99.66     98.73     98.66   96.62      93.56 
BCoV-LUN                                       -         98.74     98.66   96.61      93.58 
BCoV-MEB                                                 -         D912    96.94      93.67 
BCoV-Q                                                             -       96.87      93.59 
HCoV-OC43                                                                  -          93.5 

BCoV: bovine coronavirus; HCoV: human coronavirus; PHEV: porcine hemagglutinating encephalomyelitis virus; Q: Quebec; MEB: Mebus. 
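
Pairwise identity values of the kind reported in Table 3 are computed from aligned sequences. The Python sketch below shows one common convention, in which columns with a gap in only one sequence count as mismatches and columns with gaps in both are ignored. The paper does not state which convention was used, so this is an assumption, and the function name `percent_identity` and the short example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two sequences from the same alignment.

    Assumed convention: a column with a gap in only one sequence is a
    mismatch; a column with gaps in both sequences is skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, not taken from any of the genomes in Table 3.
    print(percent_identity("AUGGCUAAAC", "AUGGCUCAAC"))  # 90.0
    print(round(percent_identity("AUG-CU", "AUGGCU"), 2))  # 83.33
```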


Material and methods 
Cells and virus 


An alpaca coronavirus isolate designated ACoV-00-1381 
was obtained from a diarrhea sample by the Veterinary 
Diagnostic Lab at Oregon State University (Cebra et al., 
2003). The isolate was grown on human rectal tumor (HRT- 
18G) cells, which were maintained in Dulbecco’s modified 
Eagle’s medium supplemented with 10% fetal bovine serum 
(Invitrogen), penicillin (100 U/ml) and streptomycin (100 µg/ml) 
(Sigma-Aldrich, Inc.) at 37 °C with 5% CO2 in a humidified 
incubator. The virus was propagated in Dulbecco’s modified 
Eagle’s medium supplemented with 2.5 µg/ml trypsin, 
2.5 µg/ml pancreatin and 1× insulin-transferrin-selenium (Cat No. 
51500-056, GIBCO). 


Fig. 2. A maximum-likelihood phylogenetic tree of selected coronavirus 
genomes. ACoV (GenBank accession no. DQ915164) was compared to other 
Group 2 coronaviruses including porcine hemagglutinating encephalomyelitis 
virus (PHEV VW572, GenBank accession no. YP_459958); human coronavirus 
OC43 (HCoV-OC43, AY391777); bovine coronavirus Mebus strain (BCoV- 
Meb, U00735); bovine coronavirus Quebec strain (BCoV-Q, D00662); bovine 
coronavirus DB2 strain (BCoV-DB2, DQ811784); bovine coronavirus ENT 
strain (BCoV-ENT, Q91A22); bovine coronavirus LUN strain (BCoV-LUN, 
AF391542). The scale bar represents the genetic distance (nucleotide substitutions 
per site). 


Viral RNA preparation 


Cells were infected at a multiplicity of infection of 1 to 3 and 
were incubated for 3 to 5 days. The supernatant was collected 
and clarified with a bench-top Beckman Coulter Allegra 64R 
centrifuge at 9000×g for 30 min. Viruses were isolated from 
the supernatant by centrifuging through a 30% sucrose cushion 
with an ultracentrifuge (Beckman model XL-70) at 25,000 rpm 
for 2 h in an SW28 rotor. The pellets were re-suspended in 
100 µl TE buffer (10 mM Tris-HCl, 1 mM EDTA [pH 8.0]), 
and viral RNA was extracted with Trizol (Invitrogen) as 
described in the manufacturer’s instructions. 
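
Because the ultracentrifugation step is specified in rpm rather than in ×g, it can be useful to convert between the two. The sketch below uses the standard relation RCF = 1.118 × 10^-5 × r × rpm^2, with r in centimeters. The radius of 16.1 cm is an assumed maximum radius for an SW 28 rotor and should be checked against the rotor documentation; the function name `rcf_from_rpm` is hypothetical.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) at a given radius.

    Standard conversion: RCF = 1.118e-5 * radius_cm * rpm**2.
    """
    return 1.118e-5 * radius_cm * rpm ** 2


# Assumption: maximum radius of the swinging-bucket rotor, in cm.
ASSUMED_RMAX_CM = 16.1

if __name__ == "__main__":
    rcf = rcf_from_rpm(25_000, ASSUMED_RMAX_CM)
    print(f"about {rcf:,.0f} x g at the assumed maximum radius")
```

The force experienced by the sample depends on its distance from the axis, so the value at the top of the sucrose cushion would be lower than the value at the assumed maximum radius.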


RT-PCR amplification 


A one-step reverse transcriptase (RT)-PCR kit (Invitrogen) 
was used to generate viral DNA sequences for analysis. Four 
microliters of the RNA extract (0.5 µg/µl) were added to the RT-PCR 
mixture (20-µl total reaction volume) containing final 
concentrations of 1.25 µM of each forward and reverse primer, 
1× buffer for RT-PCR, 0.1 mM MgSO4 and 1 U of RT/Taq. The 
RT reaction was performed at 50 °C for 45 min, then the reverse 
transcriptase was inactivated and Taq was activated at 94 °C for 
2 min. PCR was performed for 30 cycles as follows: 94 °C 
for 30 s, 50 °C for 45 s, 72 °C for 2.5 min, followed by a 7 min 
elongation reaction at 72 °C after the final cycle. 
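
The thermal profile above can be written down as a small data structure, which makes it easy to check the total programmed hold time. The following Python sketch encodes only the temperatures and durations stated in the text; the class and variable names are hypothetical, and ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    seconds: float


# One-step RT-PCR program as described in the text.
RT_STEP = Step("reverse transcription", 50, 45 * 60)
ACTIVATION = Step("RT inactivation / Taq activation", 94, 2 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 50, 45),
    Step("extension", 72, 2.5 * 60),
)
N_CYCLES = 30
FINAL_ELONGATION = Step("final elongation", 72, 7 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (RT_STEP.seconds + ACTIVATION.seconds
            + N_CYCLES * per_cycle + FINAL_ELONGATION.seconds)


if __name__ == "__main__":
    print(total_hold_seconds() / 60, "min")  # 166.5 min
```

The programmed holds therefore add up to 166.5 min; the actual run time is longer because of ramping between temperatures.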


5'RACE 


The 5’ end of the viral genome was amplified with a 5’RACE kit 
(Invitrogen), following the manufacturer’s instructions. Briefly, 
the first-strand cDNA was synthesized with gene-specific primer 
5RACE1 (Table 1). Approximately 1 to 3 µg of total viral RNA 
was used as the template in a 20-µl RT reaction containing the 
5RACE1 primer. After purification of the first-strand cDNA, the 
5’ end was tailed with dCTP using terminal deoxynucleotidyltransferase. 
The oligo(dC) cDNA was then amplified with a 
second gene-specific primer (5RACE2) (Table 1) and the 
abridged anchor primer (AAP) specific for the 5’ dC tail. The 
primary PCR products were then reamplified with the hemi-nested 
gene-specific primer 5RACE2 and the abridged universal 
amplification primer (AUAP), under conditions recommended 
by the manufacturer. The 5’RACE products were cloned and 
sequenced as described for the RT-PCR amplimers. 


Fig. 3. A phylogenetic tree of selected Group 2 spike proteins and hemagglutinin-esterase amino acid sequences. (A) The S proteins from the selected strains were 
analyzed by the neighbor-joining method. (B) The hemagglutinin-esterase protein of the alpaca strain was compared to other selected Group 2 coronavirus 
hemagglutinin-esterase proteins. For information on the viruses analyzed, see legend to Fig. 2. 


3’RACE 


The 3’ end of the viral genome was amplified with a 3’RACE kit 
(Invitrogen) following the manufacturer’s instructions. Briefly, the 
first-strand cDNA was synthesized with an oligo(dT)-containing 
adaptor primer (AP) to the end of the viral genome. Approximately 
1 to 3 µg of total viral RNA was used as the template in a 
20-µl RT reaction containing the oligo(dT)-AP primer. After the 
cDNA synthesis, 3 µl of the cDNA was used directly in a PCR 
reaction with the gene-specific primer 3RACE (Table 1) and the abridged 
universal amplification primer (AUAP) under conditions recommended 
by the manufacturer. The 3’RACE products were then 
cloned and sequenced as described for the RT-PCR amplimers. 


Table 4 
The amino acid differences of the S glycoprotein among ACoV and the two most 
related BCoV strains, ENT and LUN 

Predicted           Amino acid 
protein position    ACoV-00-1381    LUN             ENT 
                    (ABI93999)a     (AAL57308)a     (NP_150077)a 
54                  N                               * 
174                 S               P               P 
179                 R               *               Q 
370                 D               *               Y 
483                 P               *               S 
531                 N               D               D 
565                 P               L               L 
571                 Y               H               H 
702                 S               L               L 
965                 D               E               E 
1052                A               i.              * 
1082                D               E               E 
1180                G               *               D 
1242                D               *               Y 

a Sequence accession number. 
* Same as ACoV strain. 


Primers 


The oligonucleotide primer sets used for RT-PCR, 5’RACE 
and 3’RACE, are listed in Table 1. Selection of the sequences used 
to design these primers for RT-PCR of the ACoV genome was 
based on the known genomic sequences of two BCoV 
strains, BCoV-ENT and BCoV-LUN (GenBank accession 
numbers AF391541 for ENT and AF391542 for LUN). The primers were 
selected using the Primer3 program available online. The walking 
primers (not shown) were selected from the ACoV sequence by 
the DNA sequence core facility at Oregon State University. 
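
Basic properties of a candidate primer can be checked by hand before ordering. The sketch below computes the GC fraction and a rule-of-thumb melting temperature (the Wallace rule, 2 °C per A/T and 4 °C per G/C) for the 3RACE primer from Table 1. This is a crude estimate for illustration only and is not the thermodynamic calculation performed by the Primer3 program; the function names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a DNA primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    at = sum(base in "AT" for base in primer)
    return 2 * at + 4 * gc


if __name__ == "__main__":
    three_race = "GAATCTTGACGAACCCCAGA"  # 3RACE primer sequence from Table 1
    print(len(three_race), gc_fraction(three_race), wallace_tm(three_race))
```

For this 20-nt primer, the GC fraction is 0.5 and the rule-of-thumb estimate is 60 °C.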


TOPO TA cloning 


The RT-PCR, 5’RACE and 3’RACE products were cloned 
into the pCR 2.1-TOPO plasmid vector following the 
manufacturer’s instructions (Invitrogen). 


DNA sequencing and analysis 


Nucleotide sequences were determined by sequencing 
cloned plasmid DNA with the universal primers first, then 
using primer walking to complete the sequence. In addition, 
three independently generated RT-PCR reaction products were 
combined and processed with a ChargeSwitch PCR Clean-Up 
kit (Invitrogen) before sequencing. Selected regions were 
reconfirmed by sequencing the RT-PCR products directly. All 
the cDNA clones and RT-PCR products were sequenced by the 
DNA sequence core facility at Oregon State University. The 
nucleotide sequences were assembled and analyzed with the 
EMBOSS software. Additional analyses were carried out using 
MacVector and programs described by Esteban et al. (2005). 


Nucleotide sequence accession number 


The sequences reported in this work have been deposited in 
the GenBank database under accession number DQ915164. 